Modeling Content from Human-Verified Blacklists for Accurate Zero-Hour Phish Detection
نویسندگان
چکیده
Phishing attacks are a significant security threat to users of the Internet, causing tremendous economic loss every year. Past work in academia has not been adopted by industry in part due to concerns about liability over false positives. However, blacklist-based methods heavily used in industry are slow in responding to new phish attacks, and tend to be easily overwhelmed by phishing techniques such as fast-flux and the proliferation of toolkits. In this paper, we present the design and evaluation of two blacklist-enhanced content-based algorithms. The key insight behind our algorithms is to leverage existing human-verified whitelists and blacklists, and relax them via probabilistic methods to attain high true positive rates while maintaining extremely low false positive rates. Comprehensive experiments over a diverse spectrum of data sources show that our approach currently achieves a false positive rate of 0.0434% with a true positive rate of 87.42%. Our algorithms are able to adapt quickly to new phishing attacks by incremental retraining, and present a new framework that will generalize to evolving attacks.
منابع مشابه
A Hierarchical Adaptive Probabilistic Approach for Zero Hour Phish Detection
Phishing attacks are a significant threat to users of the Internet, causing tremendous economic loss every year. In combating phish, industry relies heavily on manual verification to achieve a low false positive rate, which, however, tends to be slow in responding to the huge volume of unique phishing URLs created by toolkits. Our goal here is to combine the best aspects of human verified black...
متن کاملA Hybrid Approach to Detect Zero Day Phishing Websites
Phishing is a significant problem that tricks unsuspecting users into revealing private information involving fraudulent email and websites. This causes tremendous economic loss every year. In this paper, we proposed a novel hybrid phish detection method based on phishing blacklists and phishing properties. We used some fresh phish from PhishTank that were recently added to test that it can be ...
متن کاملAn Empirical Analysis of Phishing Blacklists
In this paper, we study the effectiveness of phishing blacklists. We used 191 fresh phish that were less than 30 minutes old to conduct two tests on eight anti-phishing toolbars. We found that 63% of the phishing campaigns in our dataset lasted less than two hours. Blacklists were ineffective when protecting users initially, as most of them caught less than 20% of phish at hour zero. We also fo...
متن کاملAuntieTuna: Personalized Content-based Phishing Detection
Phishing sites masquerade as copies of legitimate sites (“targets”) to fool people into sharing sensitive information that can then be used for fraud. Current phishing defenses can be ineffective, with training ignored, blacklists of discovered, bad sites too slow to pick up new threats, and whitelists of knowngood sites too limiting. We have developed a new technique that automatically builds ...
متن کاملPoster: Lightweight Content-based Phishing Detection
I. INTRODUCTION Increasing use of Internet banking and shopping by a broad spectrum of users results in greater potential profits from phishing attacks. Phish are fake websites that masquerade as legitimate sites, to trick unsuspecting users into sharing sensitive information: credentials, passwords, financial information, or other personal information that can enable fraud. This threat is espe...
متن کامل